Abstract

The possibilities and problems in the use of within-group slopes
of outcomes on inputs as indicators of substantive group effects are
considered. Slopes are proposed as outcome measures which may reflect
within-group processes in between-group analyses of multilevel data.
Research on aptitude x treatment interactions, contextual effects, and
school effects provides a theoretical rationale for the proposed
methodology. Data from the IEA Six Subject Survey are used to illustrate
how a group-level analysis with slopes as outcomes might look. Finally,
the statistical, empirical, substantive, and communication problems
that arise from the use of slopes as outcomes are discussed.



Slopes As Outcomes

In recent years, there has been an increasing awareness that a
thorough investigation of the effects of educational processes requires
a multilevel examination of educational data (Burstein, 1980a, 1980c;
Cronbach, 1976; Haney, 1974). Because the educational system has a
multilevel (more precisely, hierarchical) organization, the effects of
schooling on individual pupil performance can exist both between and
within its levels. Moreover, analyses at different levels address
different questions, and analyses conducted at a single level in such
contexts have inherent problems.

Though choosing a unit of analysis dominated past discussions,
especially in program evaluation (cf., e.g., House, Glass, McLean, &
Walker, 1978), current emphasis has shifted toward letting the choice of
analytical model be dictated by the substantive processes under
investigation (Burstein, 1980a, 1980b, 1980c; Burstein & Miller, 1978;
Cronbach, 1976; Haney, 1974). That is, investigators are devoting
greater attention to the development of adequate theories of educational
processes and the determination of analytical methods for identifying
the effects of such processes. These two activities are the basic
elements in the specification of appropriate analytical models.

Within the domain of research on educational effects, the substantive
processes in operation are functions of the characteristics of
pupils (e.g., aptitude, previous exposure, motivation), characteristics
of the classroom (e.g., instructional content and organization, peer
abilities and support, teacher style), and characteristics of the school
(e.g., physical resources, academic atmosphere). Moreover, these
substantive processes have collective (for the class or school as a
whole) as well as individual effects (e.g., Burstein, 1976, 1980a,
1980b, 1980c; Burstein & Miller, 1978, 1980; Cronbach, 1976; Wiley, 1970).
Given these features of multilevel educational data, the primary
difficulties in proper model specification are the determination of the
key substantive questions and the identification of evidence from the
multiple levels that can potentially resolve them.

There are various aspects of the problem of proper model specification
with multilevel data. On the one hand, there is a need for clearer
conceptualization of the connections between properties of groups (ability
level, cohesiveness) and the processes within groups (learning,
interaction, participation). On the other hand, the special features of
multilevel educational data call for special analytical methods designed
for their examination.

This paper represents one attempt to mold analytical methodology 
to the special needs of multilevel data. Specifically, its purpose is 
to consider the possibilities and problems in the use of within-group
slopes of outcomes on inputs as alternative indicators of substantive 
educational effects. 

Theoretical Rationale 

For the remainder of the paper, we restrict our attention to various
types of field studies of educational effects and assume that the
substantive questions of interest warrant group-level (classroom or school)
analyses. For example, an investigator might be interested in performance
differences of classrooms which vary in their degree of structuring,
emphasis on basic skills, or emphasis on cooperation. In such cases, it
is possible to view the sampled classrooms as alternative "treatments"
which vary along multiple dimensions and examine the relationships between
a class's scores on the various dimensions and its outcomes. Much of the
process-product research on teacher effectiveness (e.g., Anderson, Evertson,
& Brophy, 1978; Brophy & Evertson, 1974), work on education production
functions (e.g., Averch, Carroll, Donaldson, Kiesling, & Pincus, 1972),
and school effects research (e.g., Coleman, Campbell, Hobson, McPartland,
Mood, Weinfeld, & York, 1966; Comber & Keeves, 1973) fits the above
description. To some degree, large-scale evaluations of educational
interventions such as Project Follow Through (House et al., 1978; Stebbins,
St. Pierre, Proper, Andersen, & Cerva, 1977) can be viewed in a similar
fashion (Rogosa, 1978).

Regardless of the type of field study being conducted, once it has
been determined that the substantive questions of interest warrant
examination of differences among groups, the type of between-group effects
one expects to find remains to be specified. While analyses of the
relationships between "treatment" dimensions and the mean outcomes of
groups often provide useful information, important differences in
within-group processes may be obscured. These within-group processes may
arise due to group composition (e.g., ability level and mixture affecting
participation patterns), differential allocation of instructional
resources among the members of the group (e.g., the grouping and pacing
features of reading instruction), or differential reactions of group
members to the same instructional treatment (aptitude-treatment
interactions).

If important group-to-group differences in within-group processes
exist, then the use of group means as the only indicator of group outcomes
can result in misleading or, at least, incomplete estimates of group
(teacher/class/school/treatment) effects. In such cases, other indices
of group outcomes such as the standard deviation (Brown & Saks, 1975;
Klitgaard, 1975; Lohnes, 1972) should be considered.

Our interest in alternative measures of group outcomes has concentrated
on the properties of the within-group slopes from the regression
of outcomes on input (Burstein, 1976, 1980a, 1980b; Burstein, Linn, &
Capell, 1978; Burstein & Miller, 1978, 1980). Within-group slopes may
be viewed as group-level indicators of within-group processes. Moreover,
differences in slopes across groups can be the result of substantive
educational effects.

That is, we suggest that variation in slopes across groups can
reflect the influence of group characteristics such as the level and
distribution of instructional resources. For example, the relationship
of ability to achievement within classes with educational "treatments"
involving high levels of emphasis on grouping and pacing may differ
markedly from classes with low levels of these "treatment"
characteristics. Under circumstances where classrooms differ on what are
perceived to be important instructional characteristics, it seems logical
to inquire about whether, ceteris paribus, these differences are
systematically related to variation in the within-class relationship of
ability to achievement. If such relationships exist, then it can be argued
that the within-group slope, a group-level outcome, varies as a function
of a within-group process. Later on we provide some caveats about attempts
to account for slope differences.

To our knowledge, the specific features of our approach for analyzing
variation in within-group slopes have not been previously investigated
in educational research. Interest in the potential substantive importance
of heterogeneity of within-group regressions is, however, not new.
Slope heterogeneity is studied in psychological research on
aptitude-treatment interactions, in sociological research on context
effects, and in work on interactive effects of opportunity to learn.
Before describing our own conceptual and empirical work on slopes as
indicators of group outcomes, we review briefly the literature on these
topics.

Heterogeneous Slopes as Aptitude-Treatment Interactions

Research on aptitude-treatment interactions (ATI; Berliner & Cahen,
1974; Cronbach, 1976; Cronbach & Snow, 1977; Cronbach & Webb, 1975; Snow,
1976) provided the original impetus for our examination of within-group
slopes. The logic of ATI research is built on the substantive significance
of differences in within-treatment regressions. For example, it may be
theorized that a highly structured presentation might lead to a weaker
relationship between entering aptitude and final achievement than would
a treatment with less structure, or that a competitive treatment would
lead to a stronger relationship than a cooperative treatment.

ATI logic can be carried to the level of the individual groups
(classrooms, schools). Each classroom becomes a treatment whose
characteristics may be measured along several dimensions. If classrooms
contain pupils with similar distributions of entering characteristics
(e.g., comparable pretest and aptitude distributions), then differences in
within-class slopes would be anticipated on the basis of knowledge of
differences in instructional methods and resources. For example, it might
be hypothesized that there would be flatter slopes for classrooms in which
the teachers target instruction to improve the performance of
lower-ability students than in classrooms where students are allowed to
learn at their own rate.

There are several examples of the consideration of heterogeneous
within-group regressions (nested within treatments) in the recent ATI
literature (Corno, 1979; Cronbach, 1976; Cronbach & Snow, 1977; Cronbach
& Webb, 1975; Greene, 1976, 1980; Gustafsson, 1978; Snow, 1976). For
example, in their multilevel reanalysis of the Anderson (1941) data on
drill vs. meaningful instruction in arithmetic, Cronbach and Webb (1975)
found that class-by-class regressions (N=18 classes) varied greatly.
However, they considered the overall proportion of variation due to
within-class regressions to be small (4.1 percent for the drill treatment
and 6.9 percent for the meaningful treatment). Moreover, several unusual
slopes could be traced to the effects of outliers (anomalous students
within classes). Cronbach (1976) reached similar conclusions in a
reanalysis of selected data from the Cooperative Reading Study (Bond &
Dykstra, 1967).

Greene (1976, 1980) investigated the effects of choice (when, how long, 
in what sequence) and no-choice treatments on learning from workbook 
lessons. Both treatments were randomly assigned to half of the students 
in nine fourth and fifth grade classes. The heterogeneity of within 
half-class regressions of outcomes on general ability is striking (see 
Figure 2, p. 84 in Snow (1976) and Figure 1, p. 298 in Greene (1980)). 
While acknowledging the limited stability of slopes based on approximately 
12 observations, both Greene and Snow point out notable within-class 
differences which are consistent with theories about the appropriate 
aptitude-treatment match. 

In the studies cited above, the examination of heterogeneous
within-group regressions was only of secondary interest. Treatments were
considered to be discrete; a class is in either treatment A or treatment B.
Variability in slopes across classrooms within treatments represented
either a nuisance or food for thought.

Despite its theoretical soundness, the practical mechanics of
extending current lines of ATI inquiry to the consideration of classrooms
as distinct treatments which vary quasi-continuously along a number of
dimensions are complicated. Each new treatment dimension and, for that
matter, aptitude dimension forces the investigator into the consideration
of a higher-order interaction (Cronbach, 1975). Though extension via
the general linear model is seemingly straightforward, current methods
of conceptualizing and analyzing higher-order ATI's lack substantive
and statistical power. The requirements for valid and reliable indicators
of treatment dimensions may be too difficult to surmount given the present
state of knowledge in this area.

Slope Heterogeneity and Context Effects

A concern for the heterogeneity of within-group relationships is
fundamental to certain approaches to contextual analysis in sociology
and political science (Boyd & Iverson, 1979; Valkonen, 1969). Contextual
analysis is the study of the effects of properties of groups or collectives
on individuals (Lazarsfeld & Menzel, 1961).

In its extended form (cf. Boyd & Iverson, 1979), the basic contextual
model specifies that an individual-level dependent variable ($Y_{ij}$) is a
function of individual-level explanatory variables ($X_{ij}$), their
group-level counterparts ($\bar{X}_{i\cdot}$), and the interaction between
individual-level and group-level variables ($X_{ij}\bar{X}_{i\cdot}$):

$Y_{ij} = \alpha + \beta_1 X_{ij} + \beta_2 \bar{X}_{i\cdot} + \beta_3 (X_{ij}\bar{X}_{i\cdot}) + e_{ij}$    (1)

A typical contextual analysis interpretation is that a nonzero value of
$\beta_3$ (or a significant heterogeneity of regression in an analysis of
covariance) implies that the relationship between X and Y varies as a
function of the level of the group on the explanatory variable.
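
The interpretation above follows from regrouping the terms of (1): within
group i the intercept is $\alpha + \beta_2 \bar{X}_{i\cdot}$ and the slope
is $\beta_1 + \beta_3 \bar{X}_{i\cdot}$. The following is a minimal sketch
of that point, not an analysis from the paper. The coefficient values, the
noise-free toy data, and the function names are all hypothetical and were
chosen only so that the fitted and implied slopes can be compared exactly.

```python
from statistics import mean

def ols_slope_intercept(x, y):
    """Closed-form simple regression of y on x; returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical coefficient values for equation (1); not taken from the paper.
ALPHA, B1, B2, B3 = 2.0, 0.5, 0.3, 0.1

def simulate_group(center):
    """Noise-free outcomes for one group whose X values are spread around `center`."""
    x = [center + d for d in (-2.0, -1.0, 0.0, 1.0, 2.0)]
    x_bar = mean(x)
    y = [ALPHA + B1 * xi + B2 * x_bar + B3 * xi * x_bar for xi in x]
    return x, y

def implied_vs_fitted(centers):
    """Pair the slope implied by (1) with the fitted within-group slope, per group."""
    rows = []
    for c in centers:
        x, y = simulate_group(c)
        _, slope = ols_slope_intercept(x, y)
        rows.append((mean(x), B1 + B3 * mean(x), slope))
    return rows

if __name__ == "__main__":
    for x_bar, implied, fitted in implied_vs_fitted([0.0, 5.0, 10.0]):
        print(f"group mean {x_bar:5.1f}: implied {implied:.3f}, fitted {fitted:.3f}")
```

In this toy setting the within-group slope rises with the group mean on X
because the assumed $\beta_3$ is positive; with $\beta_3 = 0$ all groups
would share the slope $\beta_1$.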

More generally, Boyd and Iverson (1979) suggest that the connections
between within-group relationships and specific properties of groups be
investigated in the two regression equations:

$\alpha_i = F(W_i) + u_i$    (2)

$\beta_i = F(W_i) + v_i$    (3)

In (2) and (3), $\alpha_i$ and $\beta_i$ are the intercept and slope from
the within-group regressions of $Y_{ij}$ on $X_{ij}$, the $W_i$ measure
certain properties of groups, and $F(W_i)$ denotes an unspecified
functional form of the $W_i$.

Boyd and Iverson (1979) consider in detail the case where
$W_i = \bar{X}_{i\cdot}$ and illustrate how various combinations of
individual, group, and interaction effects give rise to specific effect
estimates in group-level analyses of mean outcomes. They also describe how
their form of contextual analysis would proceed when group variables other
than those based on group means are used (Section 3.4).

It is clear that the Boyd-Iverson approach to contextual analysis
recognizes the integral role of within-group slopes in the examination
of group properties and processes. However, their treatment is purely
didactic. At present there are no actual empirical examples of how such
an analysis might look. Moreover, the implication of the specific form
of heterogeneity reflected in (1) needs to be addressed.

Interactive Schooling Effects of Opportunity to Learn

Though approached from a different methodological perspective,
Sorenson and Hallinan's (1977) reconceptualization of school effects also
embodies underlying heterogeneity of relationships across schools. They
view learning as a time-dependent process wherein the variation in the
amount of learning achieved is a function of three concepts: ability,
effort, and opportunity to learn. Sorenson and Hallinan offer a
specification for the interrelations among these three concepts in which
the effects of ability and effort on learning are constrained by the
opportunity to learn. They carry the reasoning a step further to suggest
that between-school variation in opportunity to learn can lead to
heterogeneity among schools in the relationship of ability and effort to
learning.

Sorenson and Hallinan's proposed specification is a differential
equation model for change in achievement. However, they point out that
a reasonable representation of their conception of the learning process
can be found through the estimation, separately for each school (classroom)
(Sorenson & Hallinan, 1977, p. 278), of the regression of achievement
after exposure to a learning process of length t on initial achievement
and individual characteristics representing ability and effort.

According to Sorenson and Hallinan (p. 278), variation among schools
in the relationship of achievement at time t to initial achievement
provides information on the variation in opportunity to learn. Thus,
they anticipate differences in within-group slopes (of post-achievement
on pre-achievement) which would reflect the interactive effects of
schooling which arise through differences in opportunities for learning.
To emphasize further their perspective, Sorenson and Hallinan focus
directly on slope heterogeneity in their empirical example.

Examining Slopes as Outcomes

With the exception of Sorenson and Hallinan (1977), the research
described above has involved natural extensions of the general linear
model to incorporate hypothesized heterogeneous within-group regressions.
The multilevel ATI work to date relies mainly on descriptions and
discussion of plots of within-group regressions accompanied by information
on variance decomposition (e.g., specific within-class variation vs.
pooled within-class variation). And, while the modeling of the
within-group regressions in (2) and (3) is an integral part of the
Boyd-Iverson contextual analysis, this activity is viewed as secondary to
the examination of the general model (equation (1)) and its associated
variance decomposition.

Our emphasis departs from the work cited above in that the
within-group slope becomes an additional integral variable whose variation
is to be explained. That is, we examine the use of within-group slopes of
outcomes on inputs as a criterion measure in studies of educational
effects. Wiley (1970) was apparently the first to suggest this strategy.
As part of his argument that the collectivity (class, school, etc.) is the
appropriate unit of analysis in educational evaluation, he commented that
the focus on the mean level of achievement of the collectivity may be
too narrow. Wiley suggested that the moments of the achievement
distribution, contrasts between sub-populations, and regression
coefficients might be used as criterion measures for evaluating the
differential effect of instructional treatments on individual pupils.

Our reason for considering slopes as outcomes is that there may be
instructional effects on the within-group regression of outcomes on
input, whether there are instructional effects on group mean performance
or not. If slope effects are present, the analysis should attempt to
isolate instructional process and practice variables that are associated
with slope variation. If such variables can be found and alternative
explanations can be ruled out, then variation in slopes becomes an
important source of information for researchers and policy makers,
especially when considered along with effects on other group-level
outcomes.



In practice, our empirical investigations have treated within-group 
slopes as one of several outcomes in a between-group analysis. In the 



In the above, the two predictor vectors denote input (background) and
schooling characteristics, respectively. $\bar{Y}_i$ is the mean of the
distribution of outcome scores within group i. $\beta_i$ is the slope from
the regression of outcome on input (in this case, a measure of verbal
ability) in group i. $\sigma_i$ is the standard error of estimate from the
regression of outcome on input within group i.

The $\gamma$'s, $\theta$'s, and $\delta$'s are coefficients from the three
regression equations based on group-level data. The $\theta$'s presumably
reflect any systematic effects of schooling characteristics on slopes.
This interpretation of the $\theta$'s is directly relevant to an
elaboration of the components of educational effects on pupil outcomes.
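
As a hedged sketch only, the three between-group regressions that the text
refers to as equations (4a), (4b), and (4c) might take a linear form like
the one below. Only the outcome on the left of each equation and the
equation labels follow the text. The vector names $X_i$ and $S_i$, the
pairing of coefficient letters with equations, the subscripts, and the
error terms $u_i$, $v_i$, and $w_i$ are assumptions introduced for
illustration.

```latex
\begin{align}
\bar{Y}_i  &= \gamma_0 + \gamma_1' X_i + \gamma_2' S_i + u_i \tag{4a} \\
\beta_i    &= \theta_0 + \theta_1' X_i + \theta_2' S_i + v_i \tag{4b} \\
\sigma_i   &= \delta_0 + \delta_1' X_i + \delta_2' S_i + w_i \tag{4c}
\end{align}
```

Under this assumed form, the coefficients on $S_i$ in (4b) are the ones
that would carry any systematic effect of schooling characteristics on
within-group slopes.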



The actual analysis for slopes is a two-step procedure whereby
the within-group slope is estimated separately for each group before
being used as an outcome measure in equation (4b). There are multiple
background factors in the empirical analysis. However, we concentrate
strictly on the regression of outcome on a single input (verbal ability)
for the sake of explanation and because we believe that the slope
differences for other background variables are inconsequential after
verbal ability is controlled.
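
The two-step procedure can be sketched in a few lines. This is an
illustration under simplifying assumptions, not the analysis reported in
the paper: it uses a single group-level predictor (called `schooling`
here) rather than vectors of background and schooling characteristics,
and the toy data, dictionary keys, and function names are hypothetical.

```python
from math import sqrt
from statistics import mean

def simple_ols(x, y):
    """Return (intercept, slope, standard error of estimate) for y regressed on x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, sqrt(sse / (n - 2))

def group_outcomes(groups):
    """Step 1: summarize each group by its outcome mean, within-group slope, and SEE."""
    out = []
    for g in groups:
        _, slope, see = simple_ols(g["input"], g["outcome"])
        out.append({"schooling": g["schooling"], "mean": mean(g["outcome"]),
                    "slope": slope, "see": see})
    return out

def between_group_fit(summaries, outcome_key):
    """Step 2: regress one group-level outcome on the group-level schooling score."""
    s = [row["schooling"] for row in summaries]
    o = [row[outcome_key] for row in summaries]
    intercept, slope, _ = simple_ols(s, o)
    return intercept, slope

def toy_groups():
    """Hypothetical data: four groups whose true slope rises with the schooling score."""
    resid = [1.0, -2.0, 2.0, -2.0, 1.0]  # sums to zero, uncorrelated with input
    x = [8.0, 9.0, 10.0, 11.0, 12.0]
    groups = []
    for schooling in (0.0, 1.0, 2.0, 3.0):
        true_slope = 0.4 + 0.2 * schooling
        y = [50.0 + true_slope * (xi - 10.0) + r for xi, r in zip(x, resid)]
        groups.append({"schooling": schooling, "input": x, "outcome": y})
    return groups

if __name__ == "__main__":
    summaries = group_outcomes(toy_groups())
    for key in ("mean", "slope", "see"):
        a, b = between_group_fit(summaries, key)
        print(f"{key}: intercept {a:.3f}, schooling coefficient {b:.3f}")
```

In the toy data the schooling score changes only the within-group slope,
so a group-level analysis of means or standard errors of estimate would
show no schooling effect while the slope analysis would.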

The choice of standard errors of estimate as a group outcome
represents a departure from earlier use of standard deviations as group
outcomes (e.g., Brown & Saks, 1975; Burstein, 1980a, 1980b). However,
our focus on heterogeneous within-group slopes dictates against
considering the standard deviation as an indicator. Groups with similar
distributions (e.g., standard deviations) of entering student
characteristics would be expected to have different outcome distributions
if instructional practices led to variation in slopes. That is,
heterogeneity of regression across classes would result in heterogeneity
in standard deviations of outcomes across groups with similar standard
deviations on entering characteristics. Thus, slopes and standard
deviations are likely to be correlated (as they are here, see Table 1).
The standard error of estimate, however, can serve as a measure of
outcome variation across groups that is not likely to be related to the
slopes. As a consequence, the variables that predict variation across
groups in the standard error of estimate should be either background
conditions not adequately reflected in the slopes or schooling
characteristics that influence performance in a manner not systematically
related to entering characteristics.
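
The argument can be made concrete with the usual decomposition of outcome
variance in a simple regression. The notation $s_{Y,i}$ and $s_{X,i}$ for
the within-group standard deviations of outcome and input is introduced
here for illustration, and degrees-of-freedom corrections are ignored.

```latex
s_{Y,i}^2 \approx \beta_i^2 \, s_{X,i}^2 + \sigma_i^2
```

As a hypothetical numerical case, two groups with $s_{X,i} = 10$ and
$\sigma_i = 6$ but slopes of 0.5 and 1.0 would have outcome standard
deviations of about $\sqrt{25 + 36} \approx 7.8$ and
$\sqrt{100 + 36} \approx 11.7$. The standard deviation moves with the
slope, while the standard error of estimate is the same in both groups.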

In earlier work (Burstein et al., 1978), hypothetical data were
generated to examine the effects of heterogeneous within-group slopes on
the results from several analytical models for identifying educational
effects. Under the conditions studied, when differences in within-group
slopes were systematically related to teacher/class characteristics,
the slopes-as-outcome analysis (equation 4b) exhibited desirable
properties. This analysis, when conducted in conjunction with class-mean
analysis (equation 4a), identified the direction and, to some degree, the
severity of bias in estimating teacher effects on class mean outcomes. It
also suggested that such an analysis might help to disentangle the
multiple effects of schooling. Below, we carry this activity a step
further by examining the effects of schooling on all three group outcomes
(means, slopes, standard errors of estimate) in a specific empirical
example.

An Empirical Example 

To make the above discussion more concrete, we elaborate on an
empirical example using IEA science data on U.S. fourteen-year-olds (see
Comber & Keeves, 1973, for a description of the original study) which
has been presented previously in somewhat different forms (Burstein,
1980a, 1980b; Burstein & Miller, 1980). In this example, a school-level
analysis (N=107 schools) of factors affecting science test performance is
considered. The explanatory variables include ascribed background
characteristics (sex, socioeconomic variables), a concurrent measure of
verbal ability, and two characteristics of science instruction
(instructional approach and present exposure to science) (see Table 1 for
a description of variables). All explanatory variables are school-level
averages of individual student responses. Thus, it is possible that the
group-level effects discussed here are in part simple aggregations of
individual effects (Alwin, 1977; Boyd & Iverson, 1979; Firebaugh, 1978).

The science test was used as the outcome measure and the verbal
ability test was treated as a proxy for the input measure. That is,
science scores were regressed on verbal ability scores within each school.
Longitudinal data with pre- and post-instruction measures would clearly be
preferable to the cross-sectional data that were used in this example.
The verbal ability score is only a crude proxy for a pre-instruction
measure of student ability. It is adequate, however, for the illustrative
purposes of this paper.

The mean science score, the slope of the regression of science on
verbal ability, and the standard error of estimate from the regression
were then taken as the three descriptors of science achievement outcomes
for a school. These three outcome indices were then regressed on the
background and explanatory variables. Table 1 provides descriptive
data on the variables included in the regression analysis. Note that
within-school slopes are strongly related to the standard deviation of
science scores (r=.52), but are weakly related to school mean science
score (r=.18) and the standard error of estimate (r=.10).



Table 1 



Table 2 presents the school-level regressions of means, slopes, and
standard errors of estimate on school means of background and schooling
characteristics. The same set of explanatory variables has been used
in all three analyses for comparison purposes, though in theory different
characteristics could be expected to influence the different indicators
of group outcomes.

Table 2

Substantive interpretations of data such as those in Table 2 would
presumably deal with each outcome in turn. In the present example,
the model for explaining variation in school mean science performance
displays a relatively good fit. The coefficients for the two schooling
characteristics suggest that for schools at a given level on background
characteristics, performance is higher when students are receiving more
science instruction and when science instruction emphasizes discovery
methods.

The examination of the effects of schooling characteristics on
within-school slopes (table discussed below) and standard errors of
estimate does appear to elaborate how a school's science performance is
affected. The coefficients from the standard error of estimate model
suggest that variation of individual performance about the within-school
regression is greater in schools in which students report a large number
of books in the home and high exposure to science instruction.

We can think of at least two mechanisms that might account for the
science instruction effects on the standard errors of estimate. First, it
is possible that in schools with more opportunity for exposure to science
instruction, students with similar verbal skills may vary in the degree to
which they forego or take full advantage of increased opportunities. As a
consequence, there would be large differences in science performance of
students with similar verbal abilities simply due to differences in actual
versus possible exposure to science instruction. Alternatively, schools
with high levels of science instruction may require students with similar
verbal skills to receive more science instruction than in schools with
lower levels of science instruction regardless of students' interest in
science or aptitude for science study. In such schools, variation in
science scores for students with similar verbal abilities would be
expected to the degree that science aptitude and interest mediate
performance. Under either condition (differential opportunity, or low
aptitude and/or interest with similar opportunity), it is reasonable to
expect substantial variation of science performance among students with
similar verbal abilities in schools offering greater opportunities.

The effects of the schooling variables instructional approach and
amount of science instruction on the slopes provide an indication of
which students benefit from these practices. The relationship of a
student's verbal score to their science score is apparently stronger in
schools with high (verbal) ability students, with a high proportion of
male students, with more exposure to science, and with greater emphasis on
discovery approaches. To grossly simplify matters, given two schools with
the same sex ratio and overall mean verbal ability score, the difference
in performance between a student with a lower verbal ability and one with
higher verbal ability would be expected to be greater in the school
offering more science instruction and utilizing a discovery approach.

To highlight the contrasts, expected differences in science scores
were estimated for hypothetical students at various levels of verbal
ability in schools with below average, average, and above average levels
of discovery approach to instruction (EXPLORE) and amount of science
instruction (SCIINST) and at the average on all other variables (see
Table 3). For the extreme cases in the table [(+1,+1) and (-1,-1)],
the difference in science scores between a school's lower and higher
verbal ability students is expected to differ by about 1/2 of a standard
deviation ((9.7 - 5.7)/8.1).

Table 3



The results in Tables 2 and 3 suggest that greater opportunities
for science study and a discovery emphasis in science instruction do
lead to higher pupil performance on the average. Instruction which
emphasizes student self-direction (selection) of learning goals and
inductive problem-solving tends to magnify pre-existing differences in
pupil skills. Higher ability students tend to make more appropriate choices
and learn better under these conditions than lower ability students.
The steeper within-group slopes with greater opportunities for exposure
to science instruction and with greater emphasis on individual exploration
are consistent with results from research on informal/open/individually
guided/less structured instruction (e.g., Bennett, 1976; Peterson, 1977;
Stebbins et al., 1977).

The above discussion probably overemphasizes the practical impact
of the schooling characteristics. In order to gain a better understanding
of the source of the impact, the within-school regressions for the ten
schools with the highest EXPLORE and SCIINST scores and the ten schools
with the lowest are examined. Figures 1 and 2 contain lines showing the
regression of science scores on verbal ability for the high and low
EXPLORE, and high and low SCIINST schools, respectively. The endpoints
of the line for each school coincide with points plus or minus one
within-school standard deviation from the school's mean on verbal ability.

Figures 1 and 2

There appear to be some discernible patterns. In Figure 1, four
out of the five schools with the highest slopes (E, K, M, L, and P) were
high EXPLORE schools while all five of the lowest slopes (A, B, D, G, H)
were low EXPLORE schools. The high EXPLORE schools had a mean slope of
.93 with a standard deviation of .40 while the low EXPLORE schools had a
mean slope of .48 with a standard deviation of .44, a statistically
significant difference (p < .03). The differences are most marked
for schools with average mean verbal ability.

The plots for schools with high and low levels of science instruction
are less clear than those for EXPLORE schools (the EXPLORE and SCIINST
schools are not the same though they overlap). There are several schools
with low verbal ability and high science exposure (e.g., K, P, T) or
high verbal ability and low science exposure (e.g., H, I), so contrasting
exposure at a given level of ability is less informative. The within-school
slopes in low science instruction (SCIINST) schools tend to increase
with ability while the slopes in the high instruction schools seem to
vary less systematically. The mean and standard deviation of the slope
distributions were .87 and .34, respectively, for high SCIINST schools and
.65 and .43 for low SCIINST schools, a statistically nonsignificant
difference (p < .22). Clearly, there is a need for a more fine-grained
look at schools with specific combinations of EXPLORE, SCIINST, and
verbal ability.
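
The two comparisons of mean slopes can be checked from the summary statistics alone. The sketch below is an assumption-laden illustration: the text does not name the test used, so a two-sided, equal-variance two-sample t test with ten schools per group is assumed, and the helper names `pooled_t` and `two_sided_p` are hypothetical. The p-value is obtained by numerically integrating the Student t density so that only the standard library is needed.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus the Student t mass on [-|t|, |t|].

    The density is integrated with Simpson's rule (steps must be even).
    """
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    a, b = -abs(t), abs(t)
    h = (b - a) / steps
    total = pdf(a) + pdf(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(a + i * h)
    return 1 - total * h / 3


# Summary statistics from the text; n = 10 schools per group is assumed.
t_explore, df = pooled_t(0.93, 0.40, 10, 0.48, 0.44, 10)
t_sciinst, _ = pooled_t(0.87, 0.34, 10, 0.65, 0.43, 10)
p_explore = two_sided_p(t_explore, df)
p_sciinst = two_sided_p(t_sciinst, df)
print(f"EXPLORE: t = {t_explore:.2f}, p = {p_explore:.3f}")
print(f"SCIINST: t = {t_sciinst:.2f}, p = {p_sciinst:.3f}")
```

Under these assumptions the EXPLORE contrast falls below the .03 level and the SCIINST contrast lands close to .22, which is in line with the figures reported in the text.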

Problems with Slopes as Outcomes 

Despite their theoretical and empirical appeal, the use of within-group
slopes as outcomes is fraught with problems. The problems cover
a broad research universe: statistical, empirical, substantive, and
communicative. We briefly discuss each type below.

Statistical Problems

The mathematical properties of slopes as outcomes are not well
understood. We are essentially treating the within-group slopes as a
random variable with an unknown underlying distribution. Our logic is
somewhat related to econometric work on random coefficients regression
models (e.g., Akkina, 1974; Maddala, 1977; Swamy, 1970) though econometricians
typically deal with the case where slope variation is a random
variable unrelated to the explanatory variables in the model (for an
exception, see Hanushek, 1974).

The criticism that within-group slopes should not be treated as 
random variables is troubling, but certainly not fatal. There are too 
many instances in behavioral research where sensible analytical work has 
been conducted without mathematical confirmation of the appropriateness 
of the distributional assumptions in the measurement of a crucial 
variable. Any score which is a simple sum of other scores is also subject
to uncertainties. The final line of defense against the statistical
criticism is that, as with any other measure of unknown properties, it is
necessary to have a sound theoretical rationale for using it, to demonstrate
its empirical utility, and to seek to identify and disconfirm any
counter-interpretations to theoretical and empirical evidence.

Empirical Problems

The empirical problems with studies of slopes as outcomes are
generally the same as with any investigation of regression models. Group-to-group
variation in slopes is notoriously sensitive to inadequacies
and anomalies in the data. In general, regression coefficients are
strongly affected by measurement errors in the regressors, ceiling (floor)
effects, outliers, small numbers of observations, and multicollinearity.
In fact, some researchers view unusual group slopes as possible indications
of outliers or ceiling effects, especially when generated from
data on classrooms.

We have no quarrel with these empirical concerns about slopes. In
fact, investigators who wish to treat slopes as outcomes should examine
the scatterplots and descriptive statistics for the individual classes
or schools for outliers and ceiling effects, as an essential precautionary
measure. Outliers and floor and ceiling effects were excluded as threats
to our interpretation of the IEA data. In most cases, the slope accurately
characterized the bivariate distribution of a school's science and verbal
scores. And, while sample sizes were relatively small in some schools
(as low as 10), there was no clear relationship between slopes and sample
size or between standard errors of estimate and size. Any attenuation
problem due to measurement errors in the regressors is minor since the
psychometric properties of the verbal score used above are very good
across the whole sample (internal consistency coefficients above .9).
Moreover, there is no evidence that measurement error problems are more
severe in some schools than in others, which would have to be the case
to challenge any identified effects on slopes as outcomes.

Substantive Problems

In earlier sections on the theoretical rationale for examining slope
heterogeneity, we focussed on schooling characteristics (instructional
approach, etc.) as the source of slope differences. Realistically, such
interpretations are reasonable only for groups with comparable distributions
of entering characteristics. It is highly likely that slope heterogeneity
in studies of naturally occurring educational groups can be more readily
explained by selection effects (Alwin, 1976; Burstein, 1980a; Cronbach,
1976). Typically, classrooms and schools vary in the mechanisms which
guided their formation (community wealth, pupil ability, etc.) and in
their composition of student skills, background, and attitudes. The
flip side of the ATI coin is that one can expect a different array of
outcomes from a single treatment for classrooms (schools) which vary in
their student composition. Heterogeneous vs. homogeneous ability grouping
and high ability vs. low ability combinations would certainly lead one
to expect different treatment outcomes and would themselves be of substantive
interest (Webb, 1980).

The analyst needs to be keenly aware of selection and composition 
at every stage of a multilevel investigation. In the present example,
composition effects on slopes as measured by school means on verbal 
ability are certainly strong. The effects of the heterogeneity of 
verbal ability within the school on the slope are weaker, but significant 
nonetheless. However, composition effects as measured did not wipe out 
the more substantively interesting effects of science instruction charac- 
teristics. 

Another substantive problem arises when the various indicators of
group-level outcomes are highly correlated. Parsimony alone would argue
that precedence should be given to simpler explanations. For example,
one might argue that analyses of school means capture all of the
meaningfully interpretable effects and that presumed effects on slopes
and standard errors of estimate are actually redundant
with effects on means. Again, however, finding interpretable differences
in patterns of effects across indicators is the best way to make a case
for separate examinations of other indicators besides group means. In
our present example, we feel fairly confident that the effects on slopes
cannot be solely explained by effects on group means.
When a school's outcome mean and standard deviation are
included as explanatory variables in the model with slopes as outcomes,
the significant effects of EXPLORE and SCIINST are only marginally altered
even though the overall proportion of explained variation is more than
doubled.

Communication Problems

Communication problems refer to the whole class of difficulties in
presenting a theory and describing research results in a manner that
others will understand. This is a difficult task in multilevel analysis
models, especially those which try to capture within-group phenomena
by examining the antecedents of slope heterogeneity. Even in the simplest
cases, the reader is asked to envision patterns in the distributions of
lines across groups and try to relate these patterns to characteristics
of the education in the groups. Anyone uncomfortable with either ATI
reasoning or the conceptual distinctions possible with multilevel data
is bound to balk when asked to understand models which combine the two
lines of thought.

We have no simple answer to the communication problem in research
and evaluations which involve multilevel educational data. While little
work on multilevel problems was done in educational research between
Wiley's conference presentation (actually presented in 1967, but not
published until 1970) and Haney's (1974) paper on Project Follow Through,
there has been a virtual flood of interest in recent years, traceable
mainly to Cronbach's (1976) report. While the level of cognizance of
multilevel problems is fairly high at present, in education as well as
other social science research, time and experience are the keys to
either the demise of the concerns or the bridging of the communications
gap.

NOTE

1. In the structured/unstructured comparisons, the presence of
structure presumably benefits the less able student by providing additional
tools to tackle the tasks, thereby reducing the dependence of performance
on prior ability (Peterson, 1977). In the competitive/cooperative
example, the competitive environment offers no incentive to the more able
student to help the less able, thereby exacerbating pre-existing ability
differences which presumably reflect prior competitiveness and motivation
(Hanelin, 1978; Slavin, 1977).

Table 1. Descriptive statistics for slopes and other school-level variables in the IEA science data for the U.S. (N = 107 schools).

Variables*                            (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)     (10)     (11)

(1) Slope
(2) Science Mean                      .18
(3) Science S.D.                      .52      .34
(4) Standard Error of Estimate        .10      .30      .85
(5) Verbal Mean                       .24      .78      .21      .16
(6) Verbal S.D.                      -.12      .19      .35      .22     -.15
(7) Sex                              -.17 [illeg.]     -.16     -.10     -.11     -.05
(8) Father's Occupation               .07      .64      .26      .23      .53      .14     -.09
(9) Books in the Home            [illeg.] [illeg.] [illeg.]      .31      .58      .17     -.08      .67
(10) Instructional Approach           .25      .29 [illeg.]      .15      .17      .24      .01      .25      .29
(11) Science Instruction              .23      .03      .15      .16     -.05     -.11      .09     -.17     -.03     -.09

Mean                                  .76    57.28     6.92     5.87    27.54     4.45     1.52     6.14     4.55     9.84     1.59
Standard Deviation                    .38     4.43     1.49     1.35     2.41     1.00      .16     1.25      .37     1.60      .61

* The variables are the within-school regression of Science Total Score on Word Knowledge Total Score (slope), school means and
standard deviations on Total Science and Total Word Knowledge, and school means on sex of student, father's occupation, books
in the home, degree to which students report the use of discovery methods in science study, and a composite of student reports
[illegible] homework in all science courses. The science total scores used in this analysis have been transformed
[illegible] curvilinearity evident in the overall regression of science [illegible].

Table 2. School-level regressions of means, slopes, and standard errors of estimate on school
means on background and schooling characteristics for science achievement of U.S.
14-year-olds (N = 107 schools).

                                  Unstandardized                 Standardized
Explanatory                                     Standard                      Standard
Variables                       Mean     Slope  Error of      Mean     Slope  Error of
                                                Estimate                      Estimate

Word Knowledge                  .897      .040     -.027      .487      .254     -.047
                              (8.07)    (2.25)     (.40)

Sex                           -5.289     -.427     -.845     -.188     -.179     -.099
                              (3.91)    (1.99)    (1.06)

Father's Occupation             .583     -.019      .086      .164     -.063      .079
                              (2.43)     (.49)     (.61)

Books in the Home              3.569     -.058      .966      .294     -.057      .262
                              (4.18)     (.43)    (1.92)

Instructional Approach          .248      .062      .071      .089      .265      .084
                              (1.78)    (2.83)     (.86)

Science Instruction             .811      .168      .442      .112      .273      .200
                              (2.28)    (2.98)    (2.10)

Constant                      17.093     -.178    -1.566

R-squared                        .77       .21       .15

t statistics in parentheses.
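
The standardized coefficients in Table 2 can be cross-checked against the standard deviations in Table 1, since a standardized coefficient equals the unstandardized one multiplied by the ratio of the predictor's standard deviation to the outcome's. The sketch below does this for the slope column only. It assumes that "Word Knowledge" in Table 2 corresponds to the Verbal Mean column of Table 1; the dictionary and function names are hypothetical.

```python
# School-level standard deviations taken from Table 1.
SD = {
    "Slope": 0.38,
    "Word Knowledge": 2.41,  # assumption: Verbal Mean column of Table 1
    "Sex": 0.16,
    "Father's Occupation": 1.25,
    "Books in the Home": 0.37,
    "Instructional Approach": 1.60,
    "Science Instruction": 0.61,
}

# Slope column of Table 2: (unstandardized, reported standardized).
SLOPE_MODEL = {
    "Word Knowledge": (0.040, 0.254),
    "Sex": (-0.427, -0.179),
    "Father's Occupation": (-0.019, -0.063),
    "Books in the Home": (-0.058, -0.057),
    "Instructional Approach": (0.062, 0.265),
    "Science Instruction": (0.168, 0.273),
}


def standardize(b, sd_x, sd_y):
    """Convert an unstandardized coefficient to a standardized one."""
    return b * sd_x / sd_y


computed = {
    name: standardize(b, SD[name], SD["Slope"])
    for name, (b, _) in SLOPE_MODEL.items()
}
for name, (_, reported) in SLOPE_MODEL.items():
    print(f"{name:24s} computed {computed[name]: .3f}   reported {reported: .3f}")
```

The computed values agree with the reported ones to within rounding of the published inputs, which supports the reading of the table columns used here.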

Table 3. Predicted science scores for students at various levels of verbal
ability from schools with different levels of exposure to science
(SCIINST) and emphasis on discovery approach to instruction (EXPLORE).

    Level on (a)       Predicted        Predicted Science Score     Difference Between
                       Within-School    When Verbal Score =         Prediction at Verbal
 EXPLORE   SCIINST     Slope (b)           22       27       32     Score of 22 and 32 (c)

   +1        +1          .97             51.91    56.76    61.61          9.70
   +1         0          .87             52.46    56.81    61.16          8.70
    0        +1          .87             52.46    56.81    61.16          8.70
    0         0          .77             53.07    56.87    60.67          7.60
    0        -1          .67             53.57    56.92    60.27          6.70
   -1         0          .67             53.57    56.92    60.27          6.70
   -1        -1          .57             54.12    56.97    59.82          5.70

(a) The levels are for schools with one standard deviation above the mean (denoted
by +1), at the mean (0), and one standard deviation below the mean (-1) on combinations
of EXPLORE and SCIINST. For example, a hypothetical school with the combination
(+1,+1) has an EXPLORE score of 11.44 and a SCIINST score of 2.20.

(b) These slopes are predicted from the model for the within-class slope in Table 2
when all explanatory variables except EXPLORE and SCIINST have been set at their
mean.

(c) The between-student mean and standard deviation of Science Test scores are
approximately 57.28 and 8.1 respectively.
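
The entries of Table 3 can be approximately reproduced from Tables 1 and 2. The sketch below is a reconstruction inferred from the printed values, not a formula stated by the authors: it assumes a baseline slope of .77 at the (0, 0) combination, shifts it by each Table 2 coefficient times the Table 1 standard deviation, rounds the slope to two decimals, and assumes every school's line passes through the overall science mean (57.28) at the overall verbal mean (27.54). All names are hypothetical.

```python
BASE_SLOPE = 0.77                         # slope printed for the (0, 0) row
B_EXPLORE, SD_EXPLORE = 0.062, 1.60       # Table 2 coefficient, Table 1 SD
B_SCIINST, SD_SCIINST = 0.168, 0.61       # Table 2 coefficient, Table 1 SD
SCIENCE_MEAN, VERBAL_MEAN = 57.28, 27.54  # Table 1 means


def predicted_slope(explore_level, sciinst_level):
    """Slope for a school at the given SD levels of EXPLORE and SCIINST."""
    return (BASE_SLOPE
            + explore_level * B_EXPLORE * SD_EXPLORE
            + sciinst_level * B_SCIINST * SD_SCIINST)


def predicted_science(slope, verbal):
    """Predicted score, assuming the line passes through the overall means."""
    return SCIENCE_MEAN + slope * (verbal - VERBAL_MEAN)


for levels in [(1, 1), (1, 0), (0, 1), (0, 0), (0, -1), (-1, 0), (-1, -1)]:
    slope = round(predicted_slope(*levels), 2)
    scores = [round(predicted_science(slope, v), 2) for v in (22, 27, 32)]
    print(levels, slope, scores, round(scores[2] - scores[0], 2))
```

Under these assumptions the sketch matches the printed slopes and scores for every row except (0, 0), where the printed scores and difference correspond to a slope of .76, the mean slope in Table 1, rather than the printed .77.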

Figure 1.

Plots of within-school regressions of science scores on verbal ability for ten
lowest and ten highest schools on mean emphasis on discovery methods (EXPLORE).

Figure 2.

Plots of within-school regressions of science scores on verbal ability for ten lowest
and ten highest schools on mean exposure to science instruction (SCIINST).